# Run this in your Console, NOT your script
install.packages("tidyverse")
install.packages("rio")
install.packages("here")
Workshop: Importing & Exporting Data in R
1. Introduction: The Data “Commute”
In our last session, we got RStudio set up. But data rarely starts in R. It lives in Excel files, CSVs, SPSS files, or on the web.
Getting data into R (importing) and getting your results out of R (exporting) is a daily task for any data analyst.
The biggest single frustration for new R users is the “File Not Found” error. This is almost always a Working Directory problem. You try to read a file, but R is “standing” in the wrong folder.
Today, we’re going to learn a workflow that solves this problem permanently and makes importing and exporting a breeze.
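Before we get to that workflow, a quick way to see the problem for yourself is to ask R where it is standing and whether it can see the file from there. The sketch below uses only base R; the path data/my_data.csv is a placeholder, not a file shipped with this workshop.

```r
# Where is R "standing" right now?
current_dir <- getwd()
print(current_dir)

# Can R see the file from here? (placeholder path; swap in your own)
target <- file.path("data", "my_data.csv")
file_found <- file.exists(target)
print(file_found)

# If this prints FALSE, trying to read the file from this folder will fail
```

If the second value is FALSE, the file is not where R is looking, which is exactly the situation behind the “File Not Found” error.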
2. Setup: Install Your Toolset
We’ll need a few packages. Remember, you only install.packages() once (like downloading an app). We’re installing the whole tidyverse because it’s so common, plus rio and here.
Now, let’s load them in our script (like opening the apps).
# We'll load these at the top of our script
library(tidyverse) # Attaches the core packages (readr, dplyr, ggplot2, and more)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.2
✔ ggplot2 4.0.0 ✔ tibble 3.3.0
✔ lubridate 1.9.4 ✔ tidyr 1.3.1
✔ purrr 1.1.0
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(readxl) # Installed with the tidyverse, but not attached by library(tidyverse)
library(rio)
library(here)
here() starts at /Users/drpakhare/Dropbox/R Workshop 2025 AIIMS Bhopal
3. The “Good”: Base R Functions
R comes with built-in functions for reading data. The most common is read.csv().
- Function: read.csv()
- Pros: It’s built-in, no package needed.
- Cons: It’s slower, can be fussy with data types, and only works for .csv files.
# THE BAD WAY: An "absolute" path
# This code is brittle and will BREAK on your computer.
my_data <- read.csv("C:/Users/Sarah/Desktop/My_Project/data/data.csv")
# THE "OK" WAY: A "relative" path
# This ONLY works if your Working Directory is correct.
my_data <- read.csv("data/data.csv")
This is fine, but if you have an Excel file, you’re stuck. This leads us to the next level.
4. The “Better”: The Tidyverse Approach
The tidyverse provides a set of modern, fast, and consistent tools for data import.
- read_csv() (from the readr package) is the modern replacement for read.csv().
- read_excel() (from the readxl package) is the standard for reading Excel files.
A. Reading CSVs with read_csv()
Notice the underscore! read_csv() is typically much faster than read.csv() on large files, returns a tibble, and prints the column types it guessed so you can check them.
# The Tidyverse way to read a CSV
my_csv_data <- read_csv("data/my_data.csv")
B. Reading Excel with read_excel()
This is the real workhorse. An Excel file can have multiple sheets, so you need to be specific.
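When you are not sure which sheets a workbook contains, it can help to list them first. A minimal sketch, assuming the readxl package is installed; it uses datasets.xlsx, an example workbook bundled with readxl, rather than a file from this workshop.

```r
library(readxl)

# Path to an example workbook that ships with readxl
example_path <- readxl_example("datasets.xlsx")

# List the sheet names so you know what to pass to `sheet =`
sheet_names <- excel_sheets(example_path)
print(sheet_names)

# Read one sheet by name, taking the first name from the list
first_sheet <- read_excel(example_path, sheet = sheet_names[1])
head(first_sheet)
```

Once you know the sheet names, you can read any of them by name or by position.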
# 1. Read the default sheet (usually the first one)
my_excel_data <- read_excel("data/my_workbook.xlsx")
# 2. Read a specific sheet by its name
my_sales_data <- read_excel(
  "data/my_workbook.xlsx",
  sheet = "Sales_Data"
)
# 3. Read a specific sheet by its position (e.g., the 3rd sheet)
my_inventory_data <- read_excel(
  "data/my_workbook.xlsx",
  sheet = 3
)
This is a huge improvement! The functions are fast and consistent.
But we still have two problems:
1. We’re still vulnerable to the “File Not Found” error if our Working Directory is wrong.
2. We still have to remember which function to use (read_csv, read_excel, read_spss from the haven package, etc.).
5. The “Best”: The rio + here Combo
This is the workflow we recommend for all your projects. It solves both problems at once.
- here solves the “Where is my file?” problem.
- rio solves the “Which function do I use?” problem.
Step 1: Solving “Where?” with here
The here package has one main job: find your .Rproj file and build a path from there. No matter where your script is, here::here() always starts from the project’s “home base.”
Think of here() as a “Home Base” button in a video game. No matter where you are on the map (e.g., in a scripts/analysis sub-folder), here::here() instantly teleports you back to your home base (your .Rproj file).
From there, you just give simple directions: “go into the data folder and get my_data.csv.”
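In code, those directions look like the sketch below. It assumes the here package is installed; the data folder and my_data.csv are the placeholder names used in this section, so the last check may well return FALSE on your machine.

```r
library(here)

# Project root as detected by here (it falls back to the working directory
# when no project marker such as an .Rproj file is found)
project_root <- here::here()
print(project_root)

# Build a path piece by piece; here() inserts the separators for you
csv_path <- here::here("data", "my_data.csv")
print(csv_path)

# Check before reading, so a missing file gives a clear answer
file.exists(csv_path)
```

Because each folder is passed as a separate argument, the same line works on Windows, macOS, and Linux without editing slashes.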
Step 2: Solving “What?” with rio
The rio package has one “magic” function: import(). It’s a “Universal Translator” for data.
- You give it any file path.
- It looks at the file extension (like .csv, .xlsx, .sav, .json).
- It automatically uses the correct import package (like data.table for CSV, readxl for Excel, haven for SPSS) behind the scenes to read the data.
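The idea of choosing a reader from the file extension can be sketched in a few lines. This is an illustration only, not rio's actual implementation: import_by_extension() is a hypothetical helper, and the Excel and SPSS branches assume readxl and haven are installed.

```r
# Hypothetical helper: pick a reader based on the file extension
import_by_extension <- function(path, ...) {
  ext <- tolower(tools::file_ext(path))
  switch(
    ext,
    csv  = utils::read.csv(path, ...),
    xlsx = readxl::read_excel(path, ...),
    sav  = haven::read_sav(path, ...),
    stop("Unsupported file extension: ", ext)
  )
}

# Try it on a temporary CSV so the example is self-contained
tmp_csv <- tempfile(fileext = ".csv")
write.csv(head(mtcars), tmp_csv, row.names = FALSE)
demo <- import_by_extension(tmp_csv)
print(dim(demo))
```

rio::import() covers many more formats and options than this; the sketch only shows why you no longer need to remember a separate function per file type.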
Putting It All Together: The Golden Workflow
Now we combine them. This is the code you should use 99% of the time.
This code is robust, shareable, and works for (almost) any file type.
# --- YOUR NEW WORKFLOW ---
# 1. IMPORTING A CSV FILE
# rio::import() sees ".csv" and uses a fast CSV reader
my_csv <- rio::import(
  here::here("data", "my_data.csv")
)
# 2. IMPORTING AN EXCEL FILE (SPECIFIC SHEET)
# rio::import() sees ".xlsx" and uses read_excel()
# It's smart enough to pass the 'sheet' argument!
my_sales <- rio::import(
  here::here("data", "my_workbook.xlsx"),
  sheet = "Sales_Data"
)
# 3. IMPORTING AN SPSS FILE
# rio::import() sees ".sav" and uses haven::read_spss()
my_spss_data <- rio::import(
  here::here("data", "survey_data.sav")
)
What about Exporting?
It’s just as easy with rio::export(). rio figures out the file type you want from the file name you provide.
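One practical detail: rio::export() writes to the path you give it, but the target folder generally has to exist already. A minimal sketch, assuming you want an output folder at the project root as in this section's examples:

```r
library(here)

# Folder used by the export examples in this section
out_dir <- here::here("output")

# Create it only if it is missing; recursive = TRUE also creates parent folders
if (!dir.exists(out_dir)) {
  dir.create(out_dir, recursive = TRUE)
}

dir.exists(out_dir)
```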
# Take our 'my_sales' R data frame...
# ...and save it as a new, clean CSV file
rio::export(
  my_sales,
  here::here("output", "clean_sales_data.csv")
)
# ...or save it as a new Excel file
rio::export(
  my_sales,
  here::here("output", "clean_sales_data.xlsx")
)
6. Session Recap: The “Good, Better, Best”
- “Good” (Base R): Use read.csv(). It works, but it’s basic and fragile.
- “Better” (Tidyverse): Use read_csv() and read_excel(). These are fast and powerful, but you must manage file types and file paths manually.
- “Best” (Our Recommendation):
  - Always use RStudio Projects (.Rproj file).
  - Always build file paths using here::here().
  - Always use rio::import() and rio::export() to read and write your files.
This simple rio + here combo makes your code shareable, robust, and incredibly easy to read.
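To try the workflow without touching your own files, here is a small round trip. It is a sketch under a few assumptions: rio is installed, the built-in mtcars data set stands in for your data, and a temporary folder stands in for the project's output folder (in a real project you would build the path with here::here() instead).

```r
library(rio)

# Temporary file standing in for a path like here::here("output", "cars.csv")
out_path <- file.path(tempdir(), "cars.csv")

# Export: rio picks a CSV writer from the ".csv" extension
rio::export(mtcars, out_path)

# Import: rio picks a CSV reader from the same extension
cars_back <- rio::import(out_path)

# Compare the shape going out and coming back
dim(mtcars)
dim(cars_back)
```

Changing the extension in out_path to ".xlsx" is all it takes to switch the round trip to Excel.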